What is claimed is:



6. 



8. 



(canceled) rewritten/re-presented in claim 5
(canceled) rewritten/re-presented in claim 6
(canceled) rewritten/re-presented in claim 7
(canceled) rewritten/re-presented in claim 8
(canceled) rewritten/re-presented in claim 9
(canceled) rewritten/re-presented in claim 10
(canceled) rewritten/re-presented in claim 11
(canceled) rewritten/re-presented in claim 12



9. (re-presented - part of formerly independent claim 5) A method of evaluating whether 
an observed sequence of speech, image strip, or proteins has a subsequence being 
generated by one of a set of Hidden Markov Models (HMMs), comprising: 

a) preprocessing the observation with any standard technique (like LPC or MFCC for 
utterances, choosing the section to be analyzed for images and proteins) to obtain 
a sequence X (which is a temporal sequence of speech feature vectors, respectively 
a linear spatial sequence of features for proteins and sections in images); 

b) selecting a set of candidate patterns (like keywords, objects, respectively protein 
sequences for which we want to verify the existence in the current observation) 
represented as hidden Markov models (HMMs) with a filler state qa at the beginning
and one at the end;

c) selecting a confidence measure of the matching between a pattern and a
subsequence of the observation, chosen among the following three:

c1) the accumulated posterior, normalized with the length of the matched
subsequence, b being the index of the first and e the index of the last element
of the subsequence in X (a.k.a. simple normalization);

c2) a value obtained by partitioning the HMM states into subsets called
phonemes, defined by a method Phonemes(Q) that returns the segmentation
of a path Q in the HMM into subsets of contiguous states, each subset
belonging to a distinct phoneme, and computing one of:

c2a) the worst average match in a phoneme, called real fitting,
$\arg\min_{Q} \max_{Q' \in Phonemes(Q)} \frac{-\sum_{x_i \in Q'} \log P(q_i \mid x_i)}{|\{x_i \in Q'\}|}$

c2b) double normalization of the accumulated posterior over the number of
phonemes, J, and over the number of acoustic samples, $e_j - b_j + 1$, where
$b_j$ is the time frame where Q enters phoneme j, and $e_j$ is the exit time
frame from phoneme j,

d) selecting a number called threshold, selection that can be done by user according
to her experience with her application and environment;

e) computing for each candidate HMM pattern a number called 'score for the best 
matching with a subsequence of the observation', 

by using a process that considers emission probabilities of xq as zero, generating
iteratively for each pair $(x_i, q_k)$ between an element $x_i$ of X and a state $q_k$ of the
current HMM, in the order of increasing i, a set of possible alternative paths in
the HMM, that end in $q_k$ and generate $x_i$;

this set of paths being obtained by extending the paths associated with all the
pairs $(x_{i-1}, q_k)$ containing the previous element of X, (the empty path at $x_1$),
and extended with transitions allowed by the analyzed HMM, each path Q being
recorded by storing the length spent by Q in each phoneme (equivalent to storing
the indexes of X where Q enters and exits the different phonemes), updating the
previously chosen confidence measure of the obtained path,
and pruning the sets according to rules based on these confidence measures,
namely where:

i) the simple normalization confidence measure is used with a safe pruning that
discards a path $Q_1$ given the existence of an alternative path $Q_2$ in the same
set, whenever $S_2 < S_1$ and $L_1 < L_2$, where $S_1$ and $L_1$, respectively $S_2$ and
$L_2$, are the minus of the cumulated log of posteriors along the paths, and
the lengths of the paths, for the paths $Q_1$ respectively $Q_2$, and where the
comparison is optionally optimized by sorting competing paths based on their
cost according to a merge-sort procedure;

ii) the double normalization confidence measure, on HMMs where no path skips
any phoneme, is used with a safe pruning that discards a path $Q_1$ given the
existence of a path $Q_2$ whenever one of the following tests succeeds:

(a) $l_2 > l_1$, $A > 0$, $B < 0$ and $L_c^2 A + L_c B + C > 0$

(b) $l_2 > l_1$, $A > 0$, $B > 0$ and $C > 0$

(c) $l_2 > l_1$, $A < 0$, $C > 0$ and $L^2 A + L B + C > 0$

(d) $l_2 > l_1$, $A = 0$, $B < 0$ and $L B + C > 0$

where we denote by $a_1$, $p_1$, $l_1$, respectively by $a_2$, $p_2$ and $l_2$, the confidence
measure for the previously visited phonemes, the posterior in the current
phoneme and the length in the current phoneme for the path $Q_1$, respectively
the path $Q_2$, and we also use the notations $A = a_1 - a_2$,
$B = (a_1 - a_2)(l_1 + l_2) + p_1 - p_2$, $C = (a_1 - a_2) l_1 l_2 + p_1 l_2 - p_2 l_1$,
$L = L_{max} - \max\{l_1, l_2\}$, $L_c = -B/(2A)$,
and $L_{max}$ is the maximum acceptable length for a phoneme;
and where each set of paths may optionally be reduced by storing only the
best K matches;

iii) the double normalization confidence measure, on HMMs where some paths
skip phonemes, is used with a safe pruning that discards a path $Q_1$ given the
existence of a path $Q_2$ whenever $l_2 > l_1$, $A > 0$, $p_1 > p_2$ respectively $F_2 > F_1$,
where $F_1$ respectively $F_2$ are the number of visited phonemes for paths $Q_1$
and $Q_2$;

iv) the real fitting is used with the safe pruning: $Q_2$ is discarded in favor of
another path $Q_1$ if the confidence measure of the Real Fitting for the previous
phonemes is inferior (higher in value) for $Q_2$ compared with $Q_1$, and if $p_1 < p_2$
and $l_2 < l_1$,

where $p_1$, $l_1$, respectively $p_2$, $l_2$ represent the minus of the logarithm of
the cumulated posterior respectively the number of frames in the current
phoneme for the path $Q_1$ respectively $Q_2$;

and besides the previously mentioned safe pruning, heuristic prunings are also
used for removing paths when $p > L_{max} p_{max}$ in any state or when $p/l > p_{max}$
at the output from a phoneme, where p and l are the values in the current
phoneme for the minus of the logarithm of cumulated posterior and for the
length of the path that is discarded;

and where each set of paths may optionally be reduced by storing only the 
best K matches; 

f) returning as result the pattern with the highest score together with the score, or 
the set of all patterns with scores higher than the threshold, optionally with the 
boundaries of the subsequences of X that yield these scores. 
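
By way of non-limiting illustration, the three confidence measures named in step c) of claim 9 can be sketched as follows. This is a minimal sketch under stated assumptions: each frame is assumed to carry a cost equal to the minus logarithm of its posterior (lower is better), a path is assumed to be given as one list of frame costs per phoneme, and the function names are hypothetical. The real fitting function evaluates only the worst per-phoneme average of a single path; the minimization over paths is left to the decoder.

```python
from typing import Sequence


def simple_normalization(phoneme_costs: Sequence[Sequence[float]]) -> float:
    """c1: accumulated cost divided by the match length e - b + 1."""
    flat = [c for segment in phoneme_costs for c in segment]
    return sum(flat) / len(flat)


def real_fitting(phoneme_costs: Sequence[Sequence[float]]) -> float:
    """c2a: worst (largest) average cost over the phonemes of one path."""
    return max(sum(seg) / len(seg) for seg in phoneme_costs)


def double_normalization(phoneme_costs: Sequence[Sequence[float]]) -> float:
    """c2b: average over the J phonemes of the per-phoneme average cost."""
    per_phoneme = [sum(seg) / len(seg) for seg in phoneme_costs]
    return sum(per_phoneme) / len(per_phoneme)


# Hypothetical path: three phonemes lasting 2, 1 and 3 frames.
path = [[1.0, 3.0], [2.0], [4.0, 4.0, 4.0]]
print(simple_normalization(path))   # 18 / 6 frames
print(real_fitting(path))           # the third phoneme is the worst one
print(double_normalization(path))   # (2 + 2 + 4) / 3 phonemes
```

The double normalization gives every phoneme the same weight regardless of its duration, whereas the simple normalization lets long phonemes dominate the score.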

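By way of further illustration, the safe pruning rules i) and ii) of claim 9 can be sketched as two predicates. This is a sketch under assumptions, not a definitive implementation: the function and parameter names are hypothetical, all costs are assumed to be minus log posteriors, and the strict inequalities are used as they appear in the claim. One reading of rule ii) is that $A L^2 + B L + C$ carries the sign of the difference between the two double-normalized scores after both paths stay a further L frames in the current phoneme, so the tests check that this quadratic remains positive over the admissible range of L.

```python
def dominated_simple(s1: float, len1: int, s2: float, len2: int) -> bool:
    """Rule i): True if path Q1 can be discarded given path Q2."""
    return s2 < s1 and len1 < len2


def dominated_double(a1: float, p1: float, l1: int,
                     a2: float, p2: float, l2: int, l_max: int) -> bool:
    """Rule ii): True if path Q1 can be discarded given path Q2.

    a: confidence measure of the previously visited phonemes,
    p: cost accumulated in the current phoneme,
    l: number of frames spent in the current phoneme.
    """
    if not l2 > l1:
        return False
    A = a1 - a2
    B = (a1 - a2) * (l1 + l2) + p1 - p2
    C = (a1 - a2) * l1 * l2 + p1 * l2 - p2 * l1
    L = l_max - max(l1, l2)          # longest remaining stay in the phoneme
    if A > 0 and B < 0:              # test (a): check the vertex of the parabola
        Lc = -B / (2 * A)
        return Lc * Lc * A + Lc * B + C > 0
    if A > 0 and B > 0:              # test (b): minimum is at L = 0
        return C > 0
    if A < 0:                        # test (c): check both ends of the range
        return C > 0 and L * L * A + L * B + C > 0
    if A == 0 and B < 0:             # test (d): linear and decreasing
        return L * B + C > 0
    return False


# Q1 is worse on previous phonemes and on the current one: it is discarded.
print(dominated_double(2.0, 3.0, 1, 1.0, 1.0, 2, l_max=5))
```
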
10. (re-presented - part of formerly independent claim 5) A method of evaluating whether 
an observed sequence of speech, image strip, or proteins has a subsequence being 
generated by one element of a set of Hidden Markov Models (HMMs), comprising: 

a) preprocessing the observation with any standard technique (like LPC or MFCC for
utterances, choosing the section to be analyzed for images and proteins) to obtain
a sequence X (which is a temporal sequence of speech feature vectors, respectively
a linear spatial sequence of features for proteins and sections in images);

b) selecting a set of candidate patterns (like keywords, objects, respectively protein 
sequences for which we want to verify the existence in the current observation) 
represented as hidden Markov models (HMMs); 

c) selecting a number called threshold, selection that can be done by user according 
to her experience with her application and environment; 

d) computing for each candidate HMM pattern a number called 'score for the best 
matching with a subsequence of the observation', or at least the information about 
whether this score is higher or lower than the threshold, 

by using a method that applies Viterbi decoding for an HMM obtained by extending
the initial one with a filler state just after start and one just before the termination 
state, and estimates the emission probability of the filler states in an iterative 
manner as being equal to 

for the path Q* with highest score found in the previous iteration, where b and e
are the indexes of X between which Q* visits the HMM of the pattern,
and where the emission probability in the filler states in the first iteration can be 
initialized to any floating point number, but the iteration stops: 

i) at convergence yielding the estimation of the boundaries and score of non-filler
states of the HMM,

ii) when the confidence measure descends under a threshold value, T, estimating
only the existence of a subsequence generated by the HMM,

iii) when the emission probability of filler states, $\epsilon_0$, is initialized with T and is
reestimated, as value of $\epsilon_1$ at the end of the first iteration, to be higher than
T, deciding that no subsequence was generated by the HMM,

e) returning as result the pattern with the highest score that is higher than the
threshold, or the set of all patterns with scores higher than the threshold,
optionally with the boundaries of the subsequences of X that yield these scores.
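
By way of non-limiting illustration, the iterative decoding of step d) of claim 10 can be sketched as follows. Several points are assumptions made only for this sketch: the pattern is a left-to-right HMM whose states may only loop or advance, transition costs are ignored, `cost[t][s]` is the minus log posterior of pattern state s at frame t, and the filler cost is re-estimated as the average per-frame cost of the matched segment between b and e. The names `best_match`, `iterative_viterbi` and `exists_match` are hypothetical.

```python
import math


def best_match(cost, eps):
    """One Viterbi pass; returns (total, match_cost, b, e) of the best path.

    State 0 is the leading filler, 1..N the pattern, N+1 the trailing filler.
    Each filler frame costs eps.
    """
    T, N = len(cost), len(cost[0])
    INF = (math.inf, 0.0, -1, -1)
    prev = [INF] * (N + 2)
    prev[0] = (eps, 0.0, -1, -1)
    prev[1] = (cost[0][0], cost[0][0], 0, 0)
    for t in range(1, T):
        cur = [INF] * (N + 2)
        cur[0] = (prev[0][0] + eps, 0.0, -1, -1)
        for s in range(1, N + 1):
            stay, advance = prev[s], prev[s - 1]
            src = stay if stay[0] <= advance[0] else advance
            if src[0] == math.inf:
                continue
            c = cost[t][s - 1]
            b = t if src[2] < 0 else src[2]   # entering from the filler sets b
            cur[s] = (src[0] + c, src[1] + c, b, t)
        src = prev[N + 1] if prev[N + 1][0] <= prev[N][0] else prev[N]
        if src[0] < math.inf:
            cur[N + 1] = (src[0] + eps, src[1], src[2], src[3])
        prev = cur
    return min(prev[N], prev[N + 1])


def iterative_viterbi(cost, eps0=0.0, max_iter=100, tol=1e-12):
    """Repeat decoding until the filler cost stops changing (stop rule i)."""
    eps = eps0
    for _ in range(max_iter):
        _, match_cost, b, e = best_match(cost, eps)
        new_eps = match_cost / (e - b + 1)
        if abs(new_eps - eps) <= tol:
            break
        eps = new_eps
    return new_eps, b, e


def exists_match(cost, threshold):
    """Stop rule iii): initialize the filler cost with T, decode once, compare."""
    _, match_cost, b, e = best_match(cost, threshold)
    return match_cost / (e - b + 1) <= threshold


# Hypothetical 6-frame observation and 2-state pattern; frames 2..4 match well.
cost = [[5.0, 5.0], [5.0, 5.0], [0.2, 5.0], [0.4, 5.0], [5.0, 0.3], [5.0, 5.0]]
print(iterative_viterbi(cost))
```

In this sketch the threshold test needs a single decoding pass, which is cheaper than running the iteration to convergence when only the existence of a match is of interest.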

11. (re-presented - formerly dependent claim 8) The method of claim 9, where it carries out
the estimation of the existence of objects and their position in images, characterized 
by the fact that 

the HMM patterns are built by describing sections through views of objects, 

the emission probabilities are computed as a distance between colors (as a
Gaussian with median at the color of the pattern, or a normalized inverse of the
Euclidean distance in the RGB space),

wherein the Hidden Markov Models that model the objects can be structured of 
distinct regions, that play in the frame of the method the role of the phonemes, 

and wherein the properties of the transitions of the HMM models of the objects are
optionally modified in a dynamic manner for each path during decoding (existence
and probability) by increasing/decreasing transition probabilities returning to the
same state when matches in 'phonemes' were longer/shorter in the path than the
average predicted by the pattern.
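
By way of non-limiting illustration, the two color-based emission scores named in claim 11 can be sketched as follows. The exact normalizations are not specified in the claim, so the choices here are assumptions made only for the sketch: the Gaussian is left unnormalized with a hypothetical spread `sigma`, and the inverse distance is normalized as 1 / (1 + d) so that it stays within (0, 1]. The function names are hypothetical as well.

```python
import math


def rgb_distance(c1, c2):
    """Euclidean distance between two (R, G, B) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def gaussian_emission(pixel, pattern, sigma=30.0):
    """Gaussian-shaped score centered at the color of the pattern."""
    d = rgb_distance(pixel, pattern)
    return math.exp(-d * d / (2.0 * sigma * sigma))


def inverse_distance_emission(pixel, pattern):
    """Inverse of the RGB distance, normalized here as 1 / (1 + d)."""
    return 1.0 / (1.0 + rgb_distance(pixel, pattern))


# Hypothetical pattern color and two observed pixels.
pattern = (200, 30, 30)
print(gaussian_emission((190, 40, 35), pattern))
print(gaussian_emission((20, 200, 220), pattern))
```

Both scores equal 1 when the pixel has exactly the pattern color and decrease as the pixel moves away from it in RGB space.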

12. (re-presented - formerly dependent claim 8) The method of claim 10, where it carries 
out the estimation of the existence of objects and their position in images, characterized 
by the fact that 

the HMM patterns are built by describing sections through views of objects, 

the emission probabilities are computed as a distance between colors (as a
Gaussian with median at the color of the pattern, or a normalized inverse of the
Euclidean distance in the RGB space),

and wherein the properties of the transitions of the HMM models of the objects are 
optionally modified in a dynamic manner for each path during decoding (existence 
and probability) by increasing/decreasing transition probabilities returning to the 
same state when matches in 'phonemes' were longer/shorter in the path than the
average predicted by the pattern. 